Robust speech recognition over packet networks: an overview
Authors
Abstract
Conventional circuit-switched networks are increasingly being replaced by packet-based networks for voice communication applications. At the same time, services that support speech-based interaction are being deployed more widely. These trends demand reliable transmission of speech data, not just for playback but also to ensure acceptable automatic speech recognition (ASR) performance. In this paper, we present an overview of techniques that have been investigated to improve ASR performance against two major degradation factors in the context of packet networks: (1) information loss due to a low bit-rate codec and (2) packet loss due to channel (network) conditions. In addition, we highlight another key issue, packet loss rate, by showing ASR performance as a function of packet size and channel condition.
Related resources
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Overview of compression and packet loss effects in speech biometrics - Vision, Image and Signal Processing, IEE Proceedings
An overview is presented of compression and packet loss effects in speech biometrics. These new problems appear particularly in recent applications of biometrics over mobile or Internet networks. The influence of speech compression on speaker recognition performance in mobile networks is investigated. In a first experiment, it is found that the use of GSM coding degrades the performance. In a s...
Robust speech recognition against packet loss
Recognizing speech transmitted over mobile or computer networks poses new challenges such as packet loss in transmission. The Viterbi algorithm, the most common speech recognition approach, searches for the most likely state sequence that explains all observations. However, because it implicitly sums the log observation probabilities, the resulting solution is sensitive to outlier frames. In this pa...
Graceful degradation of speech recognition performance over lossy packet networks
This paper explores packet loss recovery in client-server Automatic Speech Recognition (ASR) systems. A forward error correction (FEC) system is designed and tested over several channel loss models, at variable amounts of data acquisition delay. In experiments with simulated packet loss, the FEC system provides robust ASR performance which degrades gracefully as packet loss rates increase. Comp...
Noise-Robust speech recognition of Co
Over the past several years, the primary focus of speech recognition research has been recognition over telephone or IP networks. Recently, IP telephony has come into increasingly wide use. This paper describes the performance of a speech recognizer on noisy speech transmitted over an H.323 IP telephony network, where the minimum mean-square error log-spectral amplitude (MMSE-LSA) method [1,2...